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Abstract 

The Galerkin finite-element method is used to solve 
the Euler and Navier-Stokes equations on prismatic 
meshes. It is shown that the prismatic grid is ad- 
vantageous for correctly and efficiently capturing the 
boundary layers in high Reynolds number flows. It 
can be captured accurately because of the ability to 
cluster grid points normal to the body. The efficiency 
derives from the implicit treatment of the normal di- 
rection. To treat the normal direction implicitly, a 
semi-implicit Runge-Kutta time stepping scheme is 
developed. The semi-implicit algorithm is validated 
on simple geometries for inviscid and viscous flows and 
its convergence history is compared to that of the ex- 
plicit Runge-Kutta scheme. The semi-implicit scheme 
is shown to be a factor of 3 to 4 faster in terms of CPU 
time to convergence. 

Introduction 

Many methods have been attempted for both the 
generation of prismatic grids and solution of flows 
using prismatic grids. Prismatic meshes have been 
generated by optimization methods [1], by solution of 
hyperbolic PDEs [2], and by algebraic methods [3]. 
Unstructured tetrahedral meshes that have been cut 
from prisms have also been created using an algebraic 
approach developed by Lohner [4] and later modified 
by Marcum [5] as well as another approach developed 
by Pirzadeh [6]. Prismatic meshes have also been 
shown to be an efficient way to obtain inviscid [2, 7] 
and viscous [8, 9] solutions for compressible flows. 

In the past, however, explicit finite-volume meth- 
ods were used to solve the Euler and Navier-Stokes 
equations on prismatic grids with the exception of the 
work described in Ref. [7] where a semi-implicit line 
Jacobi method was used to solve the Euler equations. 
On unstructured tetrahedral grids, the linelet method 
of solving the problem semi-implicitly has also been 
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demonstrated [10, 11]. 

In the present paper, an improved method for treat- 
ing the structured direction in a prismatic grid in an 
implicit manner is proposed (Figure 1). The method is 
based on Runge-Kutta time stepping, which is similar 
to, but not the same as that used in Ref. [12] for struc- 
tured grids. The new time stepping method has better 
stability characteristics and as a result has better con- 
vergence properties [13]. The time stepping method 
was implemented in the Galerkin finite-element frame- 
work which was used for discretizing the Euler and 
Navier-Stokes equations. 



Fig. 1 Column of prisms on a triangulated surface 


The basic idea behind the semi-implicit method is 
as follows. If a fully implicit scheme is used, a ma- 
trix with a large bandwidth w r ould have to be solved. 
However, due to the structure in the body-normal di- 
rection of a prismatic grid, such a matrix always has a 
block-tridiagonal subset. If all the terms which are not 
contributing to this subset are dropped, then the prob- 
lem reduces to the solution of a 5 x 5 block tridiagonal 
matrix for three dimensional problems. This means 
that the standard Thomas algorithm [14] can be used. 
The use of this semi-implicit concept is embedded in 
a diagonally dominant manner in the Runge-Kutta 
multi stage scheme. 

The implicit treatment of the body-normal direction 
can be thought of as a line-Jacobi algorithm. However, 
unlike the work in Ref. [12], all terms contributing to 
the main diagonal from all directions are used in the 
solution of the block tridiagonal matrix system. Ac- 
counting for these terms is similar to a block-Jacobi 
method in the body-tangent directions. The block- 
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Jacobi method has better stability and convergence 
properties than explicit methods [15] which improves 
the stability and convergence properties of the present 
scheme. 

The present paper uses a Galerkin finite-element 
method for spatial discretization. The unsteady terms 
of the equations are discretized and lumped to the di- 
agonal to drive the solution to the steady state. It is 
in this artificial time stepping that both a standard 
explicit Runge-Kutta and the semi-implicit Runge- 
Kutta schemes are implemented. 

We first present the governing equations and discuss 
the Galerkin finite-element discretization of the spatial 
terms. We then discuss the temporal discretization 
and present the time stepping scheme. Comparisons 
between the explicit and semi-implicit schemes are 
shown for several inviscid and viscous test cases. These 
comparisons show that the semi-implicit algorithm is 
approximately 3 to 4 times more efficient than its ex- 
plicit counterpart for the solution of both Euler and 
Navier-Stokes equations. Several solutions are also 
presented as validation of the present algorithm. 


Governing equations 

The Navier-Stokes equations for describing viscous 
fluid flow in three dimensions can be written in integral 
form as 


HI ( d t Q)dV + JJJv-7dV = 0 (1) 


where Q is the state vector of density, momentum, 
and internal energy per unit mass. 

Q = (p,pu,pv,pw,e) (2) 


J = (F - F v )t + (G- G v )j + {H- H v )k (3) 

F t G , and H are the convective flux terms and F v , 
G VJ and H v are the viscous flux terms in x, y and z 
directions respectively. The pressure term in the flux 
functions is defined with respect to Q as follows. 

P = { 7 - l)(e - I p{u 2 + v‘ + w 2 )) 

For inviscid flow simulations, F v = G v = H v = 0. 
The viscous fluxes are defined in terms of the viscous 
stress tensor [16]. 

Spatial discretization 

The steady version of the Navier-Stokes equations 
can be written as 


V • JdV = 0 (4) 

where J is the flux function. By the weighted residual 
method, we multiply by a test function <j> and inte- 
grate by parts. The resulting integral equation can be 


expressed as a sum of the integrals over each prism. 



where e is the prism counter anil n e is the number of 
prisms. Q e is the prism itself and <9D e refers to the 
faces of the prism. 

The surface integral in equation 5 cancels at each 
internal face of the grid. At the boundaries of the do- 
main, the surface integral does not vanish. Therefore, 
this surface integral is evaluated at the boundary face 
and its contributions are accumulated in the terms for 
the vertices that make up the face. 

A standard Galerkin finite-element scheme with bi- 
linear shape functions is used to approximate the vol- 
ume integral terms. The flux function is evaluated 
with the shape functions N m as follows. 
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7 = J2 ( 6 ) 

m=l 


where 7 m is the flux at the vertex m and A r m is the 
associated shape function. 

Another choice for approximating a flux function for 
non-linear flux terms exists. For example if the flux 
function J = u 2 , we could approximate it as 



This second choice has been shown to not conserve 
mass [17]. 

Substituting the test function and equation 6 into 
equation 5, we get 


III 53 VJV m ■ Y, NnJndxdydz = 0 

Q e m=1 n=1 


( 8 ) 


Due to the compact support property of the shape 
function, the test function is only non-zero at the ver- 
tex m. So equation 8 yields one equation for every 
vertex in the mesh. 

The integral in equation 8 is evaluated on each prism 
and the appropriate contributions are distributed to 
the vertices of that prism. In order to accurately ap- 
proximate this integral, a six point quadrature rule is 
used on the right triangular prism (Figure 2) to evalu- 
ate the integral. The coordinates (£,??, 0 are the local 
directions on the right triangular prism. The values of 
the weights and the locations of the quadrature points 
are tabulated in [13] along with expressions for the 
coordinate transformation. 

Applying this coordinate transformation to equation 
8, we get the following integral on a right triangular 
prism. 

jjj[.J- l ]<7N m ■ N n 7 n \J\dtdT,d<; = 0 (9) 
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terms from this integral to the diagonal. This step 
is not necessary for semi-implicit or implicit sc hemes. 
The following time term results at every vertex m 



Fig. 2 Mapping a pentahedron to a right triangu- 
lar prism 
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Defining the integral term in equation 15 as AV t 
explicit and semi -implicit Runge-Kutta schemes are 
developed below. A p-step explicit time stepping 
scheme for taking Q from time n to time n + 1 can 
be written as 


where ViV = [TVc, N n , A\] T and J is the coordinate 
transformation Jacobian. 

Solving equation 9 at each vertex corresponds to 
the solution of a system of nonlinear equations. A 
common method of solving a nonlinear system is by 
linearizing the equations. We linearize equation 9 by 
Newton’s method using = J k + A k SQ where k 
can be thought of as an iteration index. 

After linearization, equation 9 becomes 

///[ [J • N n A n \J\d^dj]d(^ SQ n = (10) 

- HI [J~ l )VN m ■ N n 7 n \J\dSdT}d( 

The resulting system of equations can be written as 


Q (0) 

Q (k) 


Qn+l 
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Q n 
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{Q (k - l) -Q n ) 

Q(k=p) 


(16) 

(17) 


(18) 


where k refers to step number and ranges from 1 to p. 
The variable J? is the residual from the inviscid and 
viscous spatial terms of the equation. 

A implicit version of the same method is developed 
by taking the residual at the new step and linearizing 
with respect to the previous step. The A;th step of 
the linearized implicit Runge-Kutta procedure can be 
written as 


A5Q = £ (11) 

where A is the coefficient matrix which gets contribu- 
tions from the coefficient to 5Q on the left hand side 
of equation 11. We solve for the unknowns 5Q. Jl is 
the residual. 

Temporal discretization 

To implement a time-stepping scheme for the pris- 
matic grid in the Galerkin finite-element framework, 
the unsteady term of the Navier-Stokes equations 

/// (dtQ)dV (12) 

is multiplied by the shape function and subsequently 
the time derivative is approximated as 


6QW = (19) 

(Q {k ~ l) - Q n ) 

where SQ^ k ^ — Q^ k ^ — Q^ k ~ 1 \ and the bracketed part 
on the left hand side would form the left hand side 
matrix. 

For the semi-implicit scheme, we only retain the 
components of the flux Jacobian matrix which con- 
tribute to the tridiagonal part of the matrix. This 
results in a left hand side matrix with a 5 x 5 block 
tridiagonal structure. The scheme can be thought of as 
the line-Jacobi scheme for a single Runge-Kutta step 
with contributions to the diagonal from all neighbor 
points. 
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HI mQ)dV = IH <t>^dV (13) 

where 8q = Q n + l - Q n and n is the time step counter. 
Using a bilinear shape function and equation 6 with 
J = 5q, we approximate equation 13 as 

(///|"4H& - 
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where the TRID symbolizes that only the contribu- 
tions to the tridiagonal portion of the matrix are re- 
tained. 


Time step computation 


This integral results in a left hand side which once 
again has a sparse coefficient matrix. We lump all 
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To compute a conservative estimate of the time step, 
the monotonicity analysis done by Barth [18] is used. 



Fig. 5 Pressure on the ONERA M6 wing at Moo — 
0.84, a = 3.06, inviscid flow 

case was 9.1688 x 10~ 4 seconds/iteration/vertex. 

Inviscid flow over ONERA M6 wing 

The inviscid flow over a ONERA M6 wing is pre- 
sented to show the use of the prismatic grid method 
on a wing geometry at a transonic Mach number. Fig- 
ure 5 shows a solid surface plot of the pressure on the 
upper surface. A lambda shock structure is easily seen 
in this view. Figure 6 shows the C p comparison to the 
experimental data [21] and the convergence behavior 
of the explicit Runge-Kutta method. Due to the high 
Re number of the experiment which was 11 x 10 6 , we 
are able to compare the inviscid solution to the exper- 
iment. We find that the pressure at the surface was 
predicted by the present method with reasonable ac- 
curacy. The location of the shock was not captured 
precisely. The present solution agrees with the trend 
documented in inviscid flow literature such as Ref. [18]. 

Flat plate boundary layer 

To validate the viscous capability of the code, the 
flat plate boundary layer flow is computed and com- 
pared to the Blasius velocity profile. The grid for this 
test case is set up as four rows of triangles along the 
plate surface and then prism columns emanating into 
the direction away from the plate. The grid is highly 
stretched with cells near the surface having aspect ra- 
tios of the order of 750:1. 

A velocity profile is compared to the Blasius solu- 
tion [22] of the boundary layer equations in Figure 7. 
The numerical solution compares to the exact solution 
very well. 

Laminar viscous flow over ONERA M6 wing 

Finally, the viscous flow over a ONERA M6 wing 
is computed. Since a turbulence model is not im- 
plemented, we compute the flow around a ONERA 
M6 wing at a Re number of 50000. The test case is 


performed at a Mach number of 0.6 and zero angle 
of attack. The grid is composed of 171135 vertices 
and 325248 prisms. A c-typo grid topology was used 
to duster points in the wake region. Because no ex- 
perimental observations exist at this Re number, the 
pressure coefficient is compared to a numerical solu- 
tion from the OVERFLOW code [23] in Figure 8 at 
several stations along the span. The two numerical so- 
lutions agree very well. OVERFLOW is a structured 
grid compressible Navier -Stokes solver. 

Convergence studies 

Parabolic bump 

The semi-implicit method was first tested by solv- 
ing the Euler equations on a 10% parabolic bump. The 
comparisons were performed on an SGI Indigo2 with a 
R10000 processor on a stretched grid containing 4805 
vertices and 7200 prisms. The 2-, 3-, and 4-step semi- 
implicit Runge-Kutta methods are compared to the 
3- and 4-step explicit Runge-Kutta method. In these 
multi step methods, variations are possible where the 
left hand side matrix is not recalculated at all the 
steps [24]. For the 4-step Runge-Kutta method, we 
compare the scheme where the matrix is recomputed 
only at the first and third steps to assess the effect of 
not updating the left hand side. 

The convergence history for explicit and semi- 
implicit methods is presented in Figure 9 and is a clear 
indication that the four step semi-implicit Runge- 
Kutta method is the fastest method. A closer look 
at the numbers in Table 1 unveils several details. The 
Mem column reveals that the semi-implicit methods 
take up more than twice the memory of the explicit 
method. From the normalized CPU time consumed 
per iteration in column CPU we discover that the 
semi-implicit method can be more than twice as ex- 
pensive as the explicit methods. However, the Time 
column reveals that the overall efficiency (defined as 
time to convergence) of the semi-implicit schemes is 
up to 4 times better than the explicit methods. Fi- 
nally, not recomputing the matrix at every step slows 
convergence. 

The number of iterations and total CPU time re- 
quired to converge the solution to a correction of 10“ 14 
is reported in Table 1. 

Sphere 

To further verify these results, we perform another 
comparison of the 4 step explicit method and the 4 
step semi-implicit method. We use the inviscid flow 
around a sphere for this comparison. The prismatic 
grid used for this comparison is larger than the bump 
case with 18000 grid points to make sure the parabolic 
bump results were not due to the small problem size. 

The convergence history from the sphere test case is 
shown in Figure 10 and the CPU times are compared 
in Table 2. Once again, the semi-implicit method per- 
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X/C 



a) 20% span station 


b) 44% span station 




c) 65% span station 


d) 90% span station 



Correction History for the ONERA M6 Wing 


Yf=0.&4, a*J.« 



e) 95% span station f) Convergence History 

Fig. 6 C p at various span stations with comparison to experiment of Ref. [21] and convergence history 
on a ONERA M6 wing at = 0.84, a = 3.06, inviscid flow 
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R-K 

Steps 

Method 

CPU 

[s/iter] 


Mom 

[Kb] 

CFL 

Iterations 

HI 

Pg|g| 

CPU 


3 

Explicit 

L. 145 

0.79 

2843 

2.5 

20000 

22900 

1.27 

4 

Explicit 

1.445 

1 

2843 

4 

12500 

18062 

1 

2 

Semi implicit 

1.702 

1.17 

7081 

2.5 

1 1000 

18722 

1.04 

3 

Semi-implicit 

2.429 

1.68 

7081 

6 

2850 

6922 

0.38 

4 

Semi implicit(l,3) 

2.492 

1.72 

7081 

9 

2200 

5482 

0.3 

4 

Semi -implicit (Full) 

3.156 

2.18 

7081 

9 

1500 

4734 

0.26 


Table 1 Comparison of explicit Runge-Kutta schemes to the semi-implicit schemes for inviscid flow over 
a parabolic bump 


R-K 

Steps 

Method 

CPU 

[s/iter] 


Mem 

CFL 

Iterations 

Time 

[3] 

Time 

■■ 

Explicit 

9.766 

■ 




146488 

i 

■■ 

Semi-implicit(Full) 

22.377 




2500 

55943 

0.38 


Table 2 Comparison of the explicit scheme to the semi-implicit schemes for inviscid flow over a sphere 


AR 

R-K 

Steps 

Method 

CPU 

[s/iter] 

PM 






wiatm 

Time 

5 

4 

Explicit 

8.84 

1 

6888 

5 

13000 

114920 

1 

5 

4 

Semi-implicit 

18.38 

2.1 

20108 

20 

2500 

45950 

0.4 

10 

4 

Explicit 

17.82 

1 

13549 

4.5 

25000 

445500 

1 

10 

4 

Semi-implicit 

40.97 

2.3 

39601 

30 

3000 

122910 

0.27 


Table 3 Comparison of the explicit scheme to the semi-implicit scheme for viscous flow over a flat plate 
at Moo =0.5, and Re = 100000 


Velocity Profile (or Bow over a Rat plate 



Fig. 7 Velocity profile for flow over a flat plate at 

Moo = 0.5, Re = 100, 000 

forms better. The semi-implicit method is an order of 
magnitude faster by number of iterations and about 3 
times more efficient in terms of CPU time. 

Flat plate boundary layer 

The advantage of the semi-implicit method is also 
demonstrated for viscous flows. The simulation of 
the boundary layer flow over a flat plate is chosen as 
the first test case. The comparison is done with two 
separate uniform grids of cell aspect ratio 5 and 10 re- 
spectively to emphasize the effect of the aspect ratio on 
the rate of convergence obtained by the semi-implicit 
scheme. The cell aspect ratio here is computed based 


on the cell width divided by the cell height. The width 
is computed using the fact that the flat plate triangu- 
lation was made from a uniform structured grid. 

The aspect ratio 5 grid has a total of 15655 vertices 
and 24000 prisms, whereas the aspect ratio 10 grid is 
simply twice as dense in the direction normal to the 
plate. The convergence histories are shown in Figure 
11 and the statistics are tabulated in Table 3. The 
CFL numbers are based on the explicit time step in 
order to establish a trend with respect to the aspect 
ratio of the grid. 

A comparison of the statistics for the aspect ratio 5 
results to the aspect ratio 10 results shows that though 
the explicit method takes almost twice as long on the 
denser grid, the semi-implicit method takes approxi- 
mately the same number of iterations. The principle 
reason for this can be attributed to the fact that the 
same time step for both aspect ratio grids can be used 
for the semi-implicit method, while a smaller time step 
is required for the explicit method due to stability re- 
strictions. 

ONERA M6 wing 

Finally, the ONERA M6 wing test case is used to 
verify that the convergence rate improvement is also 
realized on a mesh with varying aspect ratios in dif- 
ferent parts of the grid. The viscous solution over the 
ONERA M6 wing is performed at = 0.6 at zero 
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C vs X for the ONERA M6 wing at 90% span 



c) 65% span station d) 90% span station 

Fig. 8 Pressure coefficient on an ONERA M6 wing at Moo =0.6, Re = 50000 


R-K 

Steps 

Method 

CPU 

[s/iter] 

CPU 

Mem 

[Kb] 



HkuH 

T ime 

4 

Explicit 

68 

1 

61776 

0.75 

9000 

EMilil 

i 

4 

Semi-implicit 

99 

1.46 

172317 

3.0 

1800 

178200 

0.3 


Table 4 Comparison of the explicit scheme to the semi-implicit scheme for viscous flow over an ONERA 
M6 wing at Moo =0.6, and Re = 50000 


angle of attack and a Re = 50000. The grid for the test 
case consists of 171135 vertices, and 325248 prisms. 
Although the cell aspect ratio varies in different parts 
of the mesh, near the surface of the wing they ranged 
between 100 and 1000. 


The comparison between the residual histories of 
the four step explicit and semi-implicit Runge-Kutta 
methods is shown in Figure 12, and the associated 
statistics are compiled in Table 4. The CPU times 
reported are for a single CPU on an Origin2000. Once 
again, the semi-implicit method is seen to have bet- 
ter convergence properties and is the more efficient 
method. 


Concluding remarks 

The use of prismatic grids allows for fast conver- 
gence and for accurate solution of the flow field. The 
finite element flux formulation is validated for invis- 
cid and viscous flows with a variety of aspect ratio 
grids. The inviscid flow over a sphere is presented and 
is in favorable agreement with the known potential so- 
lution. The inviscid flow over a ONERA M6 wing is 
computed at transonic Mach numbers. It is found that 
the pressure at the surface of the wing is in reasonable 
agreement with the experimental results [21] except 
for the location of the shock. 

For the validation of the prismatic grid method for 
viscous flows, the case of the flat plate boundary layer 
is compared to the analytical solution of Blasius [22]. 
It is found that the computed velocity profile is in 
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Comparison of explicit to scmi-implicit schemes 


for Inviscid flow over a 10'? parabolic bump at M=0 5 



Fig. 9 Convergence history comparison for invis- 
cid flow over a 10% parabolic bump at Moo = 0 5 


Convergence history for flow over a Sphere 



Fig. 10 Convergence history comparison for invis- 
cid flow over a sphere at Moo = 0.3 


Convergence comparison for Hat plate boundary layer flow 


M=0 5. Aspect Ratto=5 



Iterations 


a) AR — 5 grid 


Convergence comparison for flat plate boundary layer flow 

M»0.5, Aspect Ratio=10 



0 10000 20000 30000 

Iterations 


b) AR = 10 grid 

Fig. 11 Convergence history comparison for vis- 
cous flow over a flat plate at M 0 o = 0 5, and Re = 
100000 


respectable agreement with the analytic solution. Fi- 
nally, viscous flow simulation over the ONERA M6 
wing is presented and compared to the numerical so- 
lution from the OVERFLOW code. The pressure 
distributions at the surface were found to be in good 
agreement. 

Comparisons are made with test cases to analyze 
the convergence benefits of the semi-implicit method 
over the explicit methods. The semi-implicit method 
is 3 to 4 times faster than the explicit method and 
it becomes more useful with higher cell aspect ratios. 
This was shown by performing numerical experiments 
where the convergence rates for several test cases of 
varying sizes were used to compare the semi-implicit 
method to explicit methods. For the cases where the 
aspect ratio of the prism cells was carefully controlled, 
we see convergence acceleration consistent with that 
predicted by theory. Tables 2 and 4 reveal that for 
a mesh where the grid aspect ratio is not carefully 
controlled, the semi-implicit method is still faster than 


a traditional Runge-Kutta method by a factor of 3. 

Further work in assessing the usefulness of the semi- 
implicit schemes is still needed. The relationship of 
the semi-implicit methods to the line-Jacobi precon- 
ditioning methods of Allmaras [25] should be investi- 
gated. The possibility of implementing agglomeration 
multi-grid in the unstructured directions is also at- 
tractive. Another attractive method worth analyzing 
is the LU-SGS implementation by Nakamura [26] on 
unstructured grids and the subsequent application on 
prisms by Sharov [27]. A combination of this method 
with line relaxation should be developed. 
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